
 tesla cybercab


From Simple to Professional: A Combinatorial Controllable Image Captioning Agent

Xinran Wang, Muxi Diao, Baoteng Li, Haiwen Zhang, Kongming Liang, Zhanyu Ma

arXiv.org Artificial Intelligence

The Controllable Image Captioning Agent (CapAgent) is a system designed to bridge the gap between simple user input and professional-level output in image captioning tasks. CapAgent automatically transforms simple user-provided instructions into detailed, professional instructions, enabling precise and context-aware caption generation. By leveraging multimodal large language models (MLLMs) and external tools such as object detection tools and search engines, the system ensures that captions adhere to specified guidelines, including sentiment, keywords, focus, and formatting. CapAgent transparently controls each step of the captioning process and shows its reasoning and tool usage at every step, fostering user trust and engagement.


Elon Musk's Tesla Cybercab is a hollow promise of a robotaxi future

New Scientist

At a glitzy event held at Warner Bros. Studios Burbank in California, Tesla CEO Elon Musk unveiled the Cybercab: a robotic, self-driving taxi. Musk said that the vehicle, which has two seats, no steering wheel and no pedals, would be available before 2027. "I think it's going to be a glorious future," he told the crowd on 10 October. Meanwhile, just a few kilometres south in Los Angeles, people are already being ferried about by autonomous vehicles operated by Waymo.